Abstract



The degree of a CSP instance is the maximum number of times that
a variable may appear in the scope of constraints. We consider the
approximate counting problem for Boolean CSPs with bounded-degree
instances for constraint languages containing the two unary constant
relations {0} and {1}. When the maximum degree is at least 25 we
obtain a complete classification of the complexity of this problem. It is
exactly solvable in polynomial time if every relation in the constraint
language is affine. It is equivalent to the problem of approximately
counting independent sets in bipartite graphs if every relation can be
expressed as conjunctions of {0}, {1} and binary implication. Otherwise,
there is no FPRAS unless NP = RP. For lower degree bounds,
additional cases arise in which the complexity is related to the
complexity of approximately counting independent sets in hypergraphs.

1 Introduction 

In the constraint satisfaction problem (CSP), we seek to assign values from
some domain to a set of variables, while satisfying given constraints on
the combinations of values that certain subsets of the variables may take.
Constraint satisfaction problems are ubiquitous in computer science, with 
close connections to graph theory, database query evaluation, type inference, 
satisfiability, scheduling and artificial intelligence [26,28,31]. CSP can also 
be reformulated in terms of homomorphisms between relational structures 
[21] and conjunctive query containment in database theory [26]. Weighted 
versions of CSP also appear in statistical physics, where they correspond to 
partition functions of spin systems [37]. 

We give formal definitions in Section 2 but, for now, consider an
undirected graph $G$ and the CSP where the domain is {red, green, blue}, the
variables are the vertices of $G$ and the constraints specify that, for every
edge $xy \in G$, $x$ and $y$ must be assigned different values. Thus, in a
satisfying assignment, no two adjacent vertices are given the same colour: the
CSP is satisfiable if, and only if, the graph is 3-colourable. As a second 
example, given a formula in 3-CNF, we can write a system of constraints 
over the variables, with domain {true, false}, that requires the assignment 
to each clause of the formula to satisfy at least one of the literals. Clearly, 
the resulting CSP is directly equivalent to the original satisfiability problem. 
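
The colouring example can be made concrete with a small brute-force check. This is only an illustrative sketch: the function name `is_three_colourable` and the two test graphs are our own choices, and exhaustive enumeration takes exponential time, so it says nothing about the complexity results discussed here.

```python
from itertools import product

def is_three_colourable(vertices, edges):
    """Brute-force CSP check: is there an assignment of three colours in
    which the endpoints of every edge receive different values?"""
    colours = ("red", "green", "blue")
    for values in product(colours, repeat=len(vertices)):
        assignment = dict(zip(vertices, values))
        if all(assignment[x] != assignment[y] for x, y in edges):
            return True
    return False

# A triangle is 3-colourable; the complete graph on four vertices is not.
triangle = (["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
k4 = (["a", "b", "c", "d"],
      [("a", "b"), ("a", "c"), ("a", "d"),
       ("b", "c"), ("b", "d"), ("c", "d")])
```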

1.1 Decision CSP 

In the uniform constraint satisfaction problem, we are given the set of con- 
straints explicitly, as lists of allowable combinations for given subsets of the 
variables; these lists can be considered as relations over the domain. Since 
it includes problems such as 3-SAT and 3-COLOURABILITY, uniform CSP is 
NP-complete. However, uniform CSP also includes problems in P, such as 
2-SAT and 2-COLOURABILITY, raising the natural question of what restric- 
tions lead to tractable problems. There are two natural ways to restrict 
CSP: we can restrict the form of the instance and we can restrict the form 
of the constraints. 

The most common restriction to CSP is to allow only certain fixed
relations in the constraints. The list of allowable relations is known as the
constraint language and we write CSP($\Gamma$) for the so-called non-uniform CSP
in which each constraint states that the values assigned to some tuple of
variables must be a tuple in a specified relation in $\Gamma$.

The classic example of this is due to Schaefer [32]. Restricting to Boolean
constraint languages (i.e., those with domain {0, 1}), he showed that CSP($\Gamma$)
is in P if $\Gamma$ is included in one of six classes and is NP-complete otherwise.
The Boolean case of CSP is often referred to as "generalized satisfiability" in 
the literature. More recent work by Bulatov has produced a corresponding 
dichotomy for the three-element domain [2]. 

These results of Schaefer and Bulatov restrict the size of the domain but 
allow relations of arbitrary arity in the constraint language. The converse 
restriction — relations of restricted arity over arbitrary finite domains — has 
also been studied in depth. In particular, the case where $\Gamma$ consists of a single
binary relation corresponds exactly to the directed graph homomorphism
problem, and to undirected graph homomorphism if we further restrict the
relation to be symmetric. In the latter case, Hell and Nešetřil have shown
that CSP($\{E\}$) $\in$ P if $E$ is the edge relation of an undirected bipartite graph
or a graph with a self-loop and CSP($\{E\}$) is NP-complete if $E$ is the edge
relation of any other undirected graph [23]. The case where $E$ is the edge
relation of a loop-free directed graph is currently open. Hell and Nešetřil
conjecture that, for any binary relation $E$, CSP($\{E\}$) is either in P or
NP-complete [24, Conjecture 5.12], though we are not aware of any conjecture
on what form this dichotomy might take.

Note that, for all $\Gamma$ discussed above, CSP($\Gamma$) has been either in P or
NP-complete. Feder and Vardi have conjectured that this holds for every
constraint language. Ladner has shown that no such dichotomy can exist
for the whole of NP because, if P $\neq$ NP, there is an infinite, strict
hierarchy between the two [29]. However, there are problems in NP, such as
graph Hamiltonicity and even connectedness, that cannot be expressed as
CSP($\Gamma$) for any finite $\Gamma$ (see footnote 1) and Ladner's diagonalization
does not seem to
be expressible in CSP [21]. Resolving Hell and Nešetřil's conjecture for a
certain class of relatively simple acyclic digraphs would immediately resolve 
the dichotomy for CSP [21]. 

Restricting the form of the instances has also been a fruitful direction 
of research. Dechter and Pearl [9] and Freuder [22] have shown that, if we
restrict to instances of tree width at most any fixed k, while still allowing 
arbitrary constraint languages, then uniform CSP is decidable determin- 
istically in polynomial time; see also [27]. An alternative, incomparable, 
restriction is on the degree of instances, which is the maximum number of 
times that any variable appears in the scopes of constraints. However, not 
much is known: Dalmau and Ford have shown that, for any fixed Boolean
constraint language $\Gamma$ containing the constant unary relations
$R_{\mathrm{zero}} = \{0\}$ and $R_{\mathrm{one}} = \{1\}$, the complexity of
CSP($\Gamma$) for instances of degree at most
three is exactly the same as the complexity of CSP($\Gamma$) with no degree
restriction [8]. The case where variables may appear at most twice has not
yet been completely classified; it is known that degree-2 CSP($\Gamma$) is as hard
as general CSP($\Gamma$) in every case where $\Gamma$ contains $R_{\mathrm{zero}}$
and $R_{\mathrm{one}}$ and some
relation that is not a $\Delta$-matroid [20]; the known polynomial-time cases come
from restrictions on the kinds of $\Delta$-matroids that appear in $\Gamma$ [8].

1.2 Counting CSP 

A generalization of the classical constraint satisfaction problem is to ask how 
many satisfying solutions there are, rather than just whether the constraints 
are satisfiable. This is referred to as the counting CSP problem, #CSP. 
Clearly, the decision problem is reducible to counting: if we can efficiently 
count the number of solutions, we can efficiently determine whether there 
is at least one. However, the converse does not hold: for example, there 
are well-known polynomial-time algorithms that determine whether a graph 
admits a perfect matching but it is #P-complete to count the number of 
perfect matchings even in a bipartite graph [35]. 

The class #P can be considered the counting "analogue" of NP: it is
defined to be the class of functions $f$ for which there is a nondeterministic,
polynomial-time Turing machine that has exactly $f(x)$ accepting paths for
input $x$ [34]. It is easily seen that the counting version of any NP decision
problem is in #P. Note that, although #P plays a similar role in the
complexity of function problems as NP does for decision problems, problems
that are complete for #P under appropriate reductions are, under standard
complexity-theoretic assumptions, considerably harder than NP-complete
problems: Toda has shown that P$^{\#\mathrm{P}}$ includes the whole of the polynomial
hierarchy [33].

1 Every CSP($\Gamma$) is defined by a sentence of existential monadic second-order logic.
However, no formula of that logic defines the class of connected graphs [18, 19] and it follows
from the proof of this that Hamiltonicity is also not definable.

Although it is not known if there is a dichotomy for CSP, Bulatov has
recently shown that, for every $\Gamma$, #CSP($\Gamma$) is either computable in
polynomial time or #P-complete [3]. However, Bulatov's dichotomy sheds little
light on which constraint languages yield polynomial-time counting CSPs
and which do not. The criterion of the dichotomy is based on "defects" in a
certain infinite algebra built up from the polymorphisms of $\Gamma$ and it is open
whether the characterization is even decidable.

So, although there is a full dichotomy for #CSP($\Gamma$), results for
restricted forms of constraint language are still of interest. In the case of
Boolean constraint languages, Creignou and Hermann have shown that only
one of Schaefer's polynomial-time cases survives the transition to counting:
#CSP($\Gamma$) has a polynomial-time algorithm if every relation in $\Gamma$ is affine
(that is, if it is the solution set of a system of linear equations over GF$_2$)
and is #P-complete otherwise [7]. It is not surprising that there are fewer
tractable cases: it is easy to arrange that every instance of CSP($\Gamma$) be
trivially satisfiable (for example, by setting all variables to zero), but the
number of non-trivial solutions might be difficult to compute. It is
interesting that the tractable cases correspond precisely to affine cases. Dyer,
Goldberg and Jerrum [14] extended Creignou and Hermann's result to weighted
Boolean #CSP. Cai, Lu and Xia [4] further extended the result to the case
of complex weights and showed that the dichotomy holds for the restriction
of the problem in which instances have degree 3. Their result implies that
the degree-3 problem #CSP$_3$($\Gamma$) (#CSP($\Gamma$) restricted to instances of
degree 3) has a polynomial-time algorithm if every relation in $\Gamma$ is affine and
is #P-complete otherwise.

The case where $\Gamma$ contains a single, symmetric binary relation $E$
corresponds exactly to the problem of counting the number of homomorphisms
from an input graph to some fixed undirected graph $H$, also known as
the counting $H$-colouring problem. Dyer and Greenhill have shown that
#CSP({E}) is in polynomial time if E is a complete relation or defines a 
complete bipartite graph and is #P-complete otherwise [16]. The dichotomy 
for directed acyclic graphs has been characterized by Dyer, Goldberg and 
Paterson [15] but it is still an open problem to characterize the cyclic di- 
rected graphs that lead to tractable #CSP problems. In contrast to the 
decision problem, it is not known whether a direct proof of the dichotomy 
for general directed graphs would yield an alternative proof of the dichotomy 
for arbitrary constraint languages.

However, restricting the tree-width of instances has a dramatic effect.
In the case of counting $H$-colourings, restricting the instance to be a graph
of tree-width at most $k$ means that the problem is solvable in linear time
for any graph $H$, a result due to Díaz, Serna and Thilikos [10]. This result
follows immediately from Courcelle's theorem, which says that, if a decision
problem is definable in monadic second-order logic (which $H$-colouring is,
for any fixed $H$), then both it and the corresponding counting problem
are computable in linear time [5,6]. However, invocations of Courcelle's
theorem hide enormous constants in the notation $O(n)$ (in this case, a tower
of twos of height $|H|$), while the work of Díaz et al. not only yields practical
constants but can also be applied to classes of instances where the tree-width
is allowed to grow logarithmically with the order of the graph, rather than
being constant.

1.3 Approximate counting 

Since #CSP($\Gamma$) is very often #P-complete, approximation algorithms play
an important role. The key concept is that of a fully polynomial randomized
approximation scheme (FPRAS). This is a randomized algorithm for
computing some function $f(x)$ that takes as its input $x$ and a constant
$\varepsilon > 0$, computes a value $Y$ such that
$e^{-\varepsilon} \le Y/f(x) \le e^{\varepsilon}$ with probability at least
$3/4$ and runs in time polynomial in both $|x|$ and $\varepsilon^{-1}$. (See
Section 2.4 for details.)

Dyer, Goldberg and Jerrum have classified the complexity of
approximately computing #CSP($\Gamma$) for Boolean constraint languages [13]. In the
case where all relations in $\Gamma$ are affine, #CSP($\Gamma$) can be computed exactly in
polynomial time by the result of Creignou and Hermann discussed above [7].
Otherwise, if every relation in $\Gamma$ can be defined by a conjunction of pins
and Boolean implications, then #CSP($\Gamma$) is as hard to approximate as the
problem #BIS of counting independent sets in a bipartite graph; otherwise,
#CSP($\Gamma$) is as hard to approximate as the problem #SAT of counting the
number of satisfying truth assignments of a Boolean formula. Dyer,
Goldberg, Greenhill and Jerrum have shown that the latter problem is complete
for #P under appropriate approximation-preserving reductions (see
Section 2.4) and has no FPRAS unless NP = RP [12], which is thought to
be unlikely. The complexity of #BIS is currently open: there is no known
FPRAS but it is not known to be #P-complete, either. #BIS is known
to be complete with respect to approximation-preserving reductions in a
logically-defined subclass of #P [12].

1.4 Our result

In this paper we consider the complexity of approximately solving Boolean
#CSP problems when instances have bounded degree. Following Dalmau
and Ford [8] and Feder [20] we consider the case in which
$R_{\mathrm{zero}} = \{0\}$ and $R_{\mathrm{one}} = \{1\}$ are available. Our main
result (Corollary 24) is a trichotomy
for the case in which instances have maximum degree $d$ for some $d \ge 25$. If
every relation in $\Gamma$ is affine then
#CSP$_d$($\Gamma \cup \{R_{\mathrm{zero}}, R_{\mathrm{one}}\}$) is solvable in
polynomial time. Otherwise, if every relation in $\Gamma$ can be expressed as
conjunctions of $R_{\mathrm{zero}}$, $R_{\mathrm{one}}$ and binary implication, then
#CSP$_d$($\Gamma \cup \{R_{\mathrm{zero}}, R_{\mathrm{one}}\}$)
is equivalent in approximation complexity to #BIS. Otherwise, it has no
FPRAS unless NP = RP. Theorem 23 gives a partial classification of the
complexity when $d < 25$. In the new cases that arise here the complexity
is given in terms of #$w$-HIS$_d$, the complexity of counting independent sets
in hypergraphs of degree at most $d$ with hyper-edges of size at most $w$.
The complexity of this problem is not fully understood. We explain what is
known about it later in the paper.

1.5 Organization 

The remainder of the paper is organized as follows. In Section 2 we define
the basic notation and relational operations that we will use throughout
the paper. We formally define bounded-degree CSPs, and the properties of
hypergraphs that we will use in the paper. In Section 3 we introduce the
classes of relations that we will use throughout the paper and give some of
their basic properties. A key tool in this type of work [4,20] is characterizing
the ability of certain relations or sets of relations to assert equalities between
variables: we show when this can be done in Section 4. The last piece
of preparatory work is to show that every Boolean relation that cannot
simulate equality in this way is definable by a conjunction of pins and
either ORs or NANDs, which is done in Section 5. Our classification of the
complexity of bounded-degree Boolean counting CSPs follows in Section 6.

2 Preliminaries 

2.1 Basic notation 

We write $\bar{a}$ for the tuple $(a_1, \dots, a_r)$, which we often shorten to
$\bar{a} = a_1 \dots a_r$. We write $a^r$ for the $r$-tuple whose every element is
equal to $a$ and $\bar{a}\bar{b}$ for the
tuple formed from the elements of $\bar{a}$ followed by those of $\bar{b}$.

The bit-wise complement of a relation $R \subseteq \{0,1\}^r$ is the relation

$\widetilde{R} = \{(a_1 \oplus 1, \dots, a_r \oplus 1) \mid \bar{a} \in R\}$,

where $\oplus$ denotes addition modulo 2.

We say that a relation $R$ is ppp-definable (see footnote 2) in a relation $R'$ and write
$R \le_{\mathrm{ppp}} R'$ if $R$ can be obtained from $R'$ by some sequence of the following
operations:

• permutation of columns; 

• pinning (taking sub-relations of the form
$R_{i \mapsto c} = \{\bar{a} \in R \mid a_i = c\}$ for
some $i$ and some $c \in \{0, 1\}$); and

• projection ("deleting the $i$th column" to give the relation
$\{a_1 \dots a_{i-1} a_{i+1} \dots a_r \mid a_1 \dots a_r \in R\}$).

The three p's in ppp-definable refer to the initial letters of the words per- 
mutation, pinning and projection. Allowing permutation of columns is just 
a notational convenience: it clearly adds no expressive power. 

It is easy to see that $\le_{\mathrm{ppp}}$ is a partial order on Boolean relations and
that, if $R \le_{\mathrm{ppp}} R'$, then $R$ can be obtained from $R'$ by first permuting the
columns, then making some pins and then projecting.
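
The three operations are easy to state on relations stored as sets of tuples. The following is a minimal sketch under our own conventions (0-based column indices and the hypothetical helper names `permute`, `pin` and `project`); it ppp-defines binary OR from ternary OR by pinning one column to zero and projecting it away.

```python
def permute(R, order):
    """Reorder columns: column j of the result is column order[j] of R."""
    return {tuple(a[i] for i in order) for a in R}

def pin(R, i, c):
    """Keep only the tuples whose i-th entry (0-based) equals c."""
    return {a for a in R if a[i] == c}

def project(R, i):
    """Delete the i-th column (0-based) from every tuple."""
    return {a[:i] + a[i + 1:] for a in R}

# Ternary OR: every tuple in {0,1}^3 except 000.
R_OR3 = {(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)} - {(0, 0, 0)}

# Pin the third column to 0, then project it away: binary OR remains.
R_OR2 = project(pin(R_OR3, 2, 0), 2)
```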

We write $R_{=} = \{00, 11\}$, $R_{\neq} = \{01, 10\}$, $R_{\mathrm{OR}} = \{01, 10, 11\}$,
$R_{\mathrm{NAND}} = \{00, 01, 10\}$ and $R_{\to} = \{00, 01, 11\}$. For $k \ge 2$, we write
$R_{=,k} = \{0^k, 1^k\}$, $R_{\mathrm{OR},k} = \{0,1\}^k \setminus \{0^k\}$ and
$R_{\mathrm{NAND},k} = \{0,1\}^k \setminus \{1^k\}$ (i.e., $k$-ary equality,
OR and NAND).

2.2 Boolean constraint satisfaction problems 

A constraint language is a set $\Gamma = \{R_1, \dots, R_m\}$ of named Boolean relations.
Given a set $V$ of variables, the set of constraints over $\Gamma$ is the set Cons($V$, $\Gamma$)
which contains $R(\bar{v})$ for every relation $R \in \Gamma$ with arity $r$ and every
$\bar{v} \in V^r$.
Note that, if $v$ and $v'$ are variables, neither $v = v'$ nor $v \neq v'$ is a constraint,
though of course $R_{=}(v, v')$ ($R_{\neq}(v, v')$) is a constraint if $R_{=} \in \Gamma$
($R_{\neq} \in \Gamma$).
The scope of a constraint $R(\bar{v})$ is the tuple $\bar{v}$. Note that the variables in the
scope of a constraint need not all be distinct.

An instance of the constraint satisfaction problem (CSP) over $\Gamma$ is a set
$V$ of variables and a set $C \subseteq$ Cons($V$, $\Gamma$) of constraints.

An assignment to a set $V$ of variables is a function $\sigma : V \to \{0, 1\}$. An
assignment to $V$ satisfies an instance $(V, C)$ if
$(\sigma(v_1), \dots, \sigma(v_r)) \in R$ for
every constraint of the form $R(v_1, \dots, v_r)$. Given an instance $I$ of some CSP,
we write $Z(I)$ for the number of satisfying assignments.

We are interested in the counting CSP problem #CSP($\Gamma$) (parameterized
by $\Gamma$), defined as follows:

Input: an instance $I = (V, C)$ of CSP over $\Gamma$.
Output: $Z(I)$.
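
For small instances, $Z(I)$ can be computed directly from the definition. The sketch below is illustrative only: the encoding of an instance as a list of variables plus a list of (relation, scope) pairs and the name `count_solutions` are assumptions of ours, and the enumeration is exponential in the number of variables.

```python
from itertools import product

def count_solutions(variables, constraints):
    """Z(I) by exhaustive enumeration. Each constraint is a pair
    (relation, scope): a set of 0/1 tuples and a tuple of variable names."""
    total = 0
    for values in product((0, 1), repeat=len(variables)):
        sigma = dict(zip(variables, values))
        if all(tuple(sigma[v] for v in scope) in rel
               for rel, scope in constraints):
            total += 1
    return total

# Binary implication, {00, 01, 11}.
R_IMP = {(0, 0), (0, 1), (1, 1)}

# x -> y and y -> z: the satisfying assignments are 000, 001, 011, 111.
instance = (["x", "y", "z"], [(R_IMP, ("x", "y")), (R_IMP, ("y", "z"))])
```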

2 This should not be confused with the concept of primitive positive definability (pp- 
definability) which appears in algebraic treatments of CSP and #CSP, for example in the 
work of Bulatov [3]. 

The degree of an instance is the greatest number of times any variable
appears among its constraints. Note that the variable $v$ appears twice in the
constraint $R(v, v)$. Our specific interest in this paper is in classifying the
complexity of bounded-degree counting CSPs. For a constraint language $\Gamma$
and a positive integer $d$, define #CSP$_d$($\Gamma$) to be the restriction of #CSP($\Gamma$)
to instances of degree at most $d$.

We can deal with instances of degree 1 immediately. 

Theorem 1. For any $\Gamma$, #CSP$_1$($\Gamma$) $\in$ FP.

Proof. Because each variable appears at most once, the constraints are
independent. Each constraint $R(v_1, \dots, v_r)$ can be satisfied in $|R|$ ways and any
variable that does not appear in a constraint can take value either 0 or 1.
The total number of satisfying assignments is the product of the number of ways each
constraint can be satisfied, times $2^k$, where $k$ is the number of unconstrained
variables. □
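
The product formula in this proof can be checked against exhaustive enumeration on a small example. This is a sketch under the same assumed encoding as before (relations as sets of 0/1 tuples, scopes as tuples of names); `count_degree_one` and `count_brute_force` are hypothetical helper names.

```python
from itertools import product
from math import prod

def count_degree_one(variables, constraints):
    """Product formula from the proof of Theorem 1. Only valid when every
    variable occurs at most once across all constraint scopes."""
    used = [v for _, scope in constraints for v in scope]
    assert len(used) == len(set(used)), "instance has degree greater than 1"
    free = len(set(variables) - set(used))
    return prod(len(rel) for rel, _ in constraints) * 2 ** free

def count_brute_force(variables, constraints):
    """Reference count by enumerating all 0/1 assignments."""
    total = 0
    for values in product((0, 1), repeat=len(variables)):
        sigma = dict(zip(variables, values))
        if all(tuple(sigma[v] for v in scope) in rel
               for rel, scope in constraints):
            total += 1
    return total

R_OR = {(0, 1), (1, 0), (1, 1)}
R_EQ = {(0, 0), (1, 1)}
# OR(a, b), EQ(c, d), and one unconstrained variable e: 3 * 2 * 2 = 12.
variables = ["a", "b", "c", "d", "e"]
constraints = [(R_OR, ("a", "b")), (R_EQ, ("c", "d"))]
```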

When considering bounded-degree CSPs with degree bounds of two or
more, it is sometimes useful to simplify the situation by allowing pinning in
the constraint language [8,20]. We write $R_{\mathrm{zero}} = \{0\}$ and
$R_{\mathrm{one}} = \{1\}$ for the
two unary relations that contain only zero and one, respectively. We refer
to constraints in $R_{\mathrm{zero}}$ and $R_{\mathrm{one}}$ as pins and we say that the
single variable
in the scope of a pin is pinned. To make notation easier, we will sometimes
write constraints using constants instead of explicit pins. That is, we will
write constraints of the form $R(x_1, \dots, x_r)$ where each $x_i$ is either a variable
from $V$ or a constant 0 or 1 (again, the $x_i$ need not be distinct). Such a
constraint can always be rewritten as a set of "proper" constraints by replacing
each instance of a constant 0 or 1 with a fresh variable $v$ and introducing
the appropriate constraint $R_{\mathrm{zero}}(v)$ or $R_{\mathrm{one}}(v)$. Note that
every variable
introduced in this way appears exactly twice in the resulting instance so, if
the degree of the CSP instance is at least two, the transformation does not
increase the instance's degree. We let $\Gamma_{\mathrm{pin}}$ denote the constraint language
$\{R_{\mathrm{zero}}, R_{\mathrm{one}}\}$.

2.3 Hypergraphs 

A hypergraph $H = (V, E)$ consists of a set $V = V(H)$ of vertices and a
set $E = E(H) \subseteq \mathcal{P}(V)$ of non-empty hyper-edges. The degree of a vertex
$v \in V(H)$ is the number $d(v)$ of hyper-edges it participates in:
$d(v) = |\{e \in E(H) \mid v \in e\}|$. The degree of a hypergraph is the maximum degree of its
vertices. If $w = \max\{|e| \mid e \in E(H)\}$, we say that $H$ has width $w$.

An independent set in a hypergraph $H$ is a set $S \subseteq V(H)$ such that
$e \not\subseteq S$ for every $e \in E(H)$. Notice, though, that we may have more than one
vertex of a hyper-edge in an independent set, so long as at least one vertex
is omitted.

We write #$w$-HIS for the following problem:

Input: a width-$w$ hypergraph $H$.
Output: the number of independent sets in $H$.

and #$w$-HIS$_d$ for the following problem:

Input: a width-$w$ hypergraph $H$ of degree at most $d$.
Output: the number of independent sets in $H$.
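
To make the definition of independence concrete, here is a brute-force counter for small hypergraphs. It is a sketch only (exponential time; the function name `count_independent_sets` and the example hypergraph are ours) and is not meant to reflect how these problems are treated later in the paper.

```python
from itertools import combinations

def count_independent_sets(vertices, hyperedges):
    """Count sets S of vertices such that no hyper-edge is a subset of S."""
    edges = [frozenset(e) for e in hyperedges]
    count = 0
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            s = set(subset)
            if not any(e <= s for e in edges):
                count += 1
    return count

# Width 3, degree 2: hyper-edges {1,2,3} and {3,4} on four vertices.
# Of the 16 subsets, 4 contain {3,4}, 2 contain {1,2,3} and 1 contains
# both, so 16 - (4 + 2 - 1) = 11 subsets are independent.
example = ([1, 2, 3, 4], [{1, 2, 3}, {3, 4}])
```

Note that {1, 2, 4} is counted: it contains two vertices of the hyper-edge {1, 2, 3} but omits the third.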

2.4 Approximation complexity 

A randomized approximation scheme (RAS) for a function $f : \Sigma^* \to \mathbb{N}$ is a
probabilistic Turing machine that takes as input a pair
$(x, \varepsilon) \in \Sigma^* \times (0, 1)$,
and produces, on an output tape, an integer random variable $Y$ satisfying
the condition $\Pr(e^{-\varepsilon} \le Y/f(x) \le e^{\varepsilon}) \ge 3/4$ (see
footnote 3). A fully polynomial randomized
approximation scheme (FPRAS) is a RAS that runs in time poly($|x|$, $\varepsilon^{-1}$).
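
The accuracy condition is a bound on the ratio of the output to the true value, not on their difference. A tiny sketch of the test (the helper name `within_tolerance` is ours, and it assumes a positive true value) shows how the bound behaves for a tolerance of 0.1, where the allowed ratio lies between roughly 0.905 and 1.105.

```python
import math

def within_tolerance(estimate, true_value, eps):
    """Check exp(-eps) <= estimate / true_value <= exp(eps).
    Assumes true_value > 0."""
    ratio = estimate / true_value
    return math.exp(-eps) <= ratio <= math.exp(eps)
```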

To compare the complexity of approximate counting problems, we use
the AP-reductions of [12]. Suppose $f$ and $g$ are two functions from some
input domain $\Sigma^*$ to the natural numbers and we wish to compare the
complexity of approximately computing $f$ to the problem of approximately
computing $g$. An approximation-preserving reduction from $f$ to $g$ is a probabilistic
oracle Turing machine $M$ that takes as input a pair
$(x, \varepsilon) \in \Sigma^* \times (0, 1)$, and
satisfies the following three conditions: (i) every oracle call made by $M$ is of
the form $(w, \delta)$ where $w \in \Sigma^*$ is an instance of $g$, and $0 < \delta < 1$ is an error
bound satisfying $\delta^{-1} \le$ poly($|x|$, $\varepsilon^{-1}$); (ii) $M$ is a randomized
approximation
scheme for $f$ whenever the oracle is a randomized approximation scheme for
$g$; and (iii) the run-time of $M$ is polynomial in $|x|$ and $\varepsilon^{-1}$.

If there is an approximation-preserving reduction from $f$ to $g$, we write
$f \le_{\mathrm{AP}} g$ and say that $f$ is AP-reducible to $g$. If $g$ has an FPRAS then so does
$f$. If $f \le_{\mathrm{AP}} g$ and $g \le_{\mathrm{AP}} f$ then we say that $f$ and $g$ are
AP-interreducible
and write $f \equiv_{\mathrm{AP}} g$.

3 Classes of relations 

A relation $R \subseteq \{0,1\}^r$ is affine if it is the set of solutions to some system of
linear equations over GF$_2$. That is, there is a set $\Sigma$ of equations in variables
$x_1, \dots, x_r$ where each equation has the form $\bigoplus_{i \in I} x_i = c$, where
$\oplus$ denotes
addition modulo 2, $I \subseteq [1, r]$ and $c \in \{0, 1\}$, and we have $\bar{a} \in R$ if, and only
if, the assignment $x_1 \mapsto a_1, \dots, x_r \mapsto a_r$ satisfies every equation in
$\Sigma$. Note
that the empty relation is defined by the equation $0 = 1$
($\bigoplus_{i \in \emptyset} x_i = 1$) and
the complete relation $\{0,1\}^r$ is defined by the empty set of equations.

If a variable $x_i$ occurs in an equation of the form $x_i = c$, we say that it
is pinned to $c$.
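
Affinity can be tested without finding the equations. The sketch below relies on a standard characterization that is not stated in this paper: a Boolean relation is affine if, and only if, it is closed under the coordinate-wise sum (modulo 2) of any three of its tuples. The name `is_affine` is ours.

```python
from itertools import product

def is_affine(R):
    """True iff a XOR b XOR c (coordinate-wise) lies in R for all a, b, c in R.
    The empty relation passes vacuously, matching the remark that it is
    defined by the equation 0 = 1."""
    return all(
        tuple(x ^ y ^ z for x, y, z in zip(a, b, c)) in R
        for a, b, c in product(R, repeat=3)
    )

R_EQ = {(0, 0), (1, 1)}           # x1 + x2 = 0
R_NEQ = {(0, 1), (1, 0)}          # x1 + x2 = 1
R_OR = {(0, 1), (1, 0), (1, 1)}   # not affine: 01 + 10 + 11 = 00
```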

3 The choice of the value $3/4$ is inconsequential: the same class of problems has an FPRAS
if we choose any probability $1/2 < p < 1$ [25].

Lemma 2. Let $R \subseteq \{0,1\}^r$ be affine. If $R' \le_{\mathrm{ppp}} R$, then $R'$ is affine.

Proof. Let $R$ be defined by the set $\Sigma$ of equations. Clearly, permuting
the columns of $R$ just permutes the indices of the variables. For any
$i \in [1, r]$ and $c \in \{0, 1\}$, $R_{i \mapsto c}$ is defined by
$\Sigma \cup \{x_i = c\}$ so is affine. For
projections, let $R'$ be the result of projecting away $R$'s $r$-th column. If
$x_r$ does not appear in any equation in $\Sigma$, then $\Sigma$, considered as a set of
equations over $x_1, \dots, x_{r-1}$, already defines $R'$. Otherwise, we may take any
equation involving $x_r$, rewrite it as
$x_r = c \oplus \bigoplus_{i \in I \setminus \{r\}} x_i$ and substitute the
sum for $x_r$ in every equation where it occurs. The resulting set of equations,
in variables $x_1, \dots, x_{r-1}$, defines $R'$. □

Let OR-conj be the set of Boolean relations that are defined by a con- 
junction of pins and ORs of any arity and NAND-conj the set of Boolean 
relations definable by conjunctions of pins and NANDs (i.e., negated con- 
junctions) of any arity. For example, the relation defined by the formula 

$(x_1 = 0) \wedge (x_2 = 1) \wedge \mathrm{OR}(x_3, x_4, x_5, x_6) \wedge \mathrm{OR}(x_5, x_7)$

is in OR-conj. We say that one of the defining formulae of these relations is 
normalized if 

• no pinned variable appears in any OR or NAND, 

• the arguments of each individual OR and NAND are distinct, 

• every OR or NAND has at least two arguments and 

• no OR or NAND's arguments are a subset of any other's. 

Note that the formula in the example above is normalized. 

Lemma 3. Every OR-conj (respectively, NAND-conj) relation is defined by
a unique normalized formula. 

Proof. We show the result for OR-conj relations; the case for NAND-conj is 
similar. 

Let $R$ be an OR-conj relation defined by the formula $\phi$. The second
and subsequent occurrences of any variable within a single clause can be
deleted. Any clause that contains a variable pinned to one can be deleted;
any variable that is pinned to zero can be deleted from any clause in which
it appears. The disjunction OR($x$) is equivalent to pinning $x$ to one. If $\phi$
contains a clause that is a subset of another, any assignment that satisfies
the smaller clause necessarily satisfies the larger, which can, therefore, be
deleted. This establishes that every OR-conj relation is defined by at least
one normalized formula.
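
The rewriting steps in this paragraph can be sketched as a small procedure. The representation (a dict of pins and a collection of clauses given as collections of variable names) and the name `normalize_or_conj` are our own assumptions. The sketch returns `None` when some clause has all of its variables pinned to zero, in which case the relation is empty; that case is not discussed in the text.

```python
def normalize_or_conj(pins, clauses):
    """pins: dict variable -> 0/1; clauses: iterable of iterables of variables.
    Returns (pins, set of frozenset clauses) in normalized form, or None if
    some clause cannot be satisfied (assumed handling, see lead-in)."""
    pins = dict(pins)
    todo = [frozenset(c) for c in clauses]   # sets drop repeated variables
    changed = True
    while changed:
        changed = False
        kept = []
        for clause in todo:
            if any(pins.get(v) == 1 for v in clause):
                continue                     # contains a variable pinned to one
            # Any remaining pinned variable is pinned to zero: delete it.
            clause = frozenset(v for v in clause if v not in pins)
            if not clause:
                return None                  # every variable was pinned to zero
            if len(clause) == 1:             # OR(x) is the same as pinning x to one
                (v,) = clause
                pins[v] = 1
                changed = True               # new pin: rescan the other clauses
                continue
            kept.append(clause)
        todo = kept
    # Delete every clause that strictly contains another clause.
    minimal = {c for c in todo if not any(d < c for d in todo)}
    return pins, minimal

result = normalize_or_conj(
    {"x1": 0},
    [["x1", "x2"], ["x2", "x3", "x3"], ["x3", "x4", "x5"], ["x4", "x5"]],
)
```

In the example, OR(x1, x2) collapses to the pin x2 = 1, which deletes OR(x2, x3); the clause OR(x3, x4, x5) is then removed because it contains OR(x4, x5).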

To prove uniqueness, suppose that $R \subseteq \{0,1\}^r$ is defined by normalized
formulae $\phi$ and $\psi$. The two formulae must obviously pin the same variables
and we may assume that none are pinned. Consider any clause in $\phi$, which
we may assume, without loss of generality, to be OR($x_1, \dots, x_k$). Since no
other clause of $\phi$ is a subset of $\{x_1, \dots, x_k\}$, every other clause must include at
least one variable from $x_{k+1}, \dots, x_r$ and, therefore, $0^{k-1}1^{r-k+1}$ satisfies $\phi$
and $0^k 1^{r-k}$ does not.

Now, suppose that this clause does not appear in $\psi$. There are two cases.
If $\psi$ contains a clause whose variables are a subset of $\{x_1, \dots, x_k\}$, which we
may assume, without loss of generality, to be OR($x_1, \dots, x_\ell$) for some $\ell < k$,
then $\psi$ is not satisfied by $0^{k-1}1^{r-k+1}$. Otherwise, every clause of $\psi$ contains
at least one variable from $x_{k+1}, \dots, x_r$, so $0^k 1^{r-k}$ satisfies $\psi$.

In either case, $\phi$ and $\psi$ define different relations. It follows that every
clause that appears in $\phi$ must also appear in $\psi$. By symmetry, every clause
that appears in $\psi$ must appear in $\phi$ so the two formulae are identical. □

Given the uniqueness of defining normalized formulae, we define the 
width of an OR-conj or NAND-conj relation $R$ to be wd($R$), the greatest
number of arguments to any of the ORs or NANDs in the normalized formula 
that defines it. Note that, from the definition of normalized formulae, there 
are no relations of width 1. 

We define IM-conj to be the class of relations defined by a conjunction
of pins and (binary) implications. This class is called IM$_2$ in [13]. Say that
a conjunction $\phi$ of pins and implications that defines a relation $R \in$ IM-conj
is normalized if no pinned variable appears in an implication and every
implication has distinct arguments.

Lemma 4. Every relation in IM-conj is defined by a normalized formula. 

Proof. Let $R \in$ IM-conj be defined by the formula $\phi$. Any implication $x \to x$
can be deleted as it does not constrain the value of $x$. If the variable $y$ is pinned
to zero then the implication $y \to z$ can be deleted and the implication $z \to y$
can be replaced by pinning $z$ to zero. If $y$ is pinned to one, $y \to z$ can be
replaced by pinning $z$ to one and $z \to y$ can be deleted. Iterating, we can
remove all implications involving pinned variables. □

Note that, in contrast to normalized OR-conj and NAND-conj formulae, 
normalized IM-conj formulae are not necessarily unique. For example, the 
following three normalized formulae all define the same relation: 

$x \to y \wedge y \to z \wedge z \to x$
$x \to z \wedge z \to y \wedge y \to x$
$x \to y \wedge y \to x \wedge x \to z \wedge z \to x$.
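
This non-uniqueness is easy to confirm by enumeration. In the sketch below, the helper name `relation_of` is ours and we take the second formula to be the reverse cycle $x \to z$, $z \to y$, $y \to x$. All three formulae force the variables to be equal, so each defines the relation {000, 111}.

```python
from itertools import product

def relation_of(implications, variables=("x", "y", "z")):
    """Set of 0/1 tuples over `variables` satisfying every pair (u, v),
    read as the implication u -> v."""
    rel = set()
    for values in product((0, 1), repeat=len(variables)):
        s = dict(zip(variables, values))
        if all(s[u] <= s[v] for u, v in implications):
            rel.add(values)
    return rel

f1 = [("x", "y"), ("y", "z"), ("z", "x")]
f2 = [("x", "z"), ("z", "y"), ("y", "x")]
f3 = [("x", "y"), ("y", "x"), ("x", "z"), ("z", "x")]
```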

3.1 ppp-defining Boolean connectives 

Lemma 5. If $R \in$ IM-conj is not affine, then $R_{\to} \le_{\mathrm{ppp}} R$.

Proof. Let $R \in$ IM-conj be defined by the normalized formula $\phi$. If there
are variables $x_1, \dots, x_r$ such that $\phi$ contains the implications
$x_1 \to x_2$, ..., $x_{r-1} \to x_r$ and $x_r \to x_1$ then, in any satisfying assignment
for $\phi$, the
variables $x_1, \dots, x_r$ must take the same value. We may assume, then, that,
if $\phi$ contains such a cycle of implications, it also contains $x_i \to x_j$ for every
distinct pair $x_i, x_j \in \{x_1, \dots, x_r\}$.

There are two cases. First, if $\phi$ is symmetric (in the sense that, for
every implication $x \to y$ in $\phi$, the formula also contains $y \to x$) then $\phi$ is
equivalent to a conjunction of pins and equalities between variables, so $R$
is affine. Otherwise, there must be at least one pair of variables such that
$x \to y$ is a conjunct of $\phi$ but $y \to x$ is not. We ppp-define implication
by pinning to zero every unpinned variable $z$ such that there is a chain of
implications $z \to v_1$, $v_1 \to v_2$, ..., $v_{r-1} \to v_r$, $v_r \to x$ and pinning to one
every other unpinned variable apart from $x$ and $y$. Finally, project out the
$r - 2$ constant columns. □

Lemma 6. If $R \in$ OR-conj has width $w$, then
$R_{\mathrm{OR},2}, \dots, R_{\mathrm{OR},w} \le_{\mathrm{ppp}} R$.
Similarly, if $R \in$ NAND-conj has width $w$, then
$R_{\mathrm{NAND},2}, \dots, R_{\mathrm{NAND},w} \le_{\mathrm{ppp}} R$.

Proof. Let $R \in$ OR-conj have arity $r$ and width $w$. Let $R$ be defined by
the normalized formula $\phi$ which, without loss of generality, we may assume
to contain the clause OR($x_1, \dots, x_w$). Since $\phi$ is normalized, every other
clause must contain at least one variable from $x_{w+1}, \dots, x_r$. For any $k$ with
$2 \le k \le w$, we can ppp-define $R_{\mathrm{OR},k}$ by pinning $x_{k+1}, \dots, x_w$ to zero and
pinning $x_{w+1}, \dots, x_r$ to one.

The proof for $R \in$ NAND-conj is similar. □
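
As a worked instance of this construction, take the width-3 relation defined by OR($x_1, x_2, x_3$) $\wedge$ OR($x_3, x_4$). The example relation and the helper names `pin_and_project` and `or_relation` are our own illustrative choices; columns are indexed from 0 in the code, and the pinned columns are projected away after pinning.

```python
from itertools import product

# R is defined by the normalized formula OR(x1, x2, x3) AND OR(x3, x4).
R = {a for a in product((0, 1), repeat=4)
     if (a[0] or a[1] or a[2]) and (a[2] or a[3])}

def pin_and_project(R, pins):
    """Pin each column i (0-based) to pins[i], then delete the pinned columns."""
    arity = len(next(iter(R)))
    free = [i for i in range(arity) if i not in pins]
    kept = [a for a in R if all(a[i] == c for i, c in pins.items())]
    return {tuple(a[i] for i in free) for a in kept}

def or_relation(k):
    """The k-ary OR relation: all of {0,1}^k except the all-zero tuple."""
    return set(product((0, 1), repeat=k)) - {(0,) * k}

R_OR3 = pin_and_project(R, {3: 1})         # k = 3: pin x4 to one
R_OR2 = pin_and_project(R, {2: 0, 3: 1})   # k = 2: pin x3 to zero, x4 to one
```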

3.2 Characterizations 

The following proposition establishes a duality between OR-conj and
NAND-conj relations. Whenever we say that $R$ is OR-conj or NAND-conj, it is
equivalent to say that $R$ or $\widetilde{R}$ is OR-conj. (Or, of course, that $R$ or
$\widetilde{R}$ is NAND-conj.)

Proposition 7. A relation $R \subseteq \{0,1\}^r$ is in OR-conj if, and only if,
$\widetilde{R} \in$ NAND-conj.

Proof. Suppose $R$ is defined by the normalized formula

$P \wedge \bigwedge_{1 \le j \le m} \bigvee_{i \in I_j} x_i$,

where $P$ is a conjunction of pins and $I_1, \dots, I_m \subseteq [1, r]$. Then
$\widetilde{R}$ is defined by the formula

$P' \wedge \bigwedge_{1 \le j \le m} \bigvee_{i \in I_j} \neg x_i$,

where $P'$ is the conjunction of pins with the opposite values to those in $P$.
This formula is equivalent to

$P' \wedge \bigwedge_{1 \le j \le m} \neg \bigwedge_{i \in I_j} x_i$,

which is a NAND-conj formula, as required. □

Given tuples $a, b \in \{0,1\}^r$, we write $a \leq b$ if $a_i \leq b_i$ for all $i \in [1, r]$. If
$a \leq b$ and $a \neq b$, we write $a < b$.

We say that a relation $R \subseteq \{0,1\}^r$ is monotone if, whenever $a \in R$ and
$a \leq b$, then $b \in R$. We say that $R$ is antitone if, whenever $a \in R$ and
$b \leq a$, then $b \in R$. That is, changing zeroes to ones in a tuple in a monotone
relation gives another tuple in the relation; similarly, antitone relations are
preserved by changing ones to zeroes. It is easy to see that $R$ is monotone
if, and only if, $\widetilde{R}$ is antitone. We say that a relation is pseudo-monotone
(respectively, pseudo-antitone) if its restriction to non-constant columns is
monotone (respectively, antitone).
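
To make these definitions concrete, here is a small Python sketch that tests monotonicity and antitonicity by brute force and checks the duality under bit-wise complement on binary OR and NAND. The function names are illustrative assumptions; relations are represented as sets of 0/1 tuples.

```python
from itertools import product

def leq(a, b):
    """Componentwise order on tuples: a <= b."""
    return all(x <= y for x, y in zip(a, b))

def is_monotone(relation, arity):
    """Every tuple above a member of the relation is also a member."""
    return all(b in relation
               for a in relation
               for b in product((0, 1), repeat=arity)
               if leq(a, b))

def is_antitone(relation, arity):
    """Every tuple below a member of the relation is also a member."""
    return all(b in relation
               for a in relation
               for b in product((0, 1), repeat=arity)
               if leq(b, a))

def complement(relation):
    """Bit-wise complement: flip every bit of every tuple."""
    return {tuple(1 - x for x in t) for t in relation}

R_OR = {(0, 1), (1, 0), (1, 1)}
R_NAND = complement(R_OR)
```

On this example, `R_OR` is monotone but not antitone, and its complement `R_NAND` is antitone.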

Proposition 8. A relation $R \subseteq \{0,1\}^r$ is in OR-conj (respectively, NAND-conj)
if, and only if, it is pseudo-monotone (respectively, pseudo-antitone).

Proof. Suppose $R \in \text{OR-conj}$, defined by some normalized conjunction $\varphi$ of
disjunctions and pins. Let $a \in R$ and let $b \geq a$ agree with $a$ on all pinned
columns. Since $a$ already satisfies every conjunct of $\varphi$, $b$ satisfies all the
pins, and assigning 1 to more of the variables cannot cause any of the
disjunctions to become unsatisfied. Hence $b \in R$, so $R$ is pseudo-monotone.

Towards the converse, suppose that $R$ is (properly) monotone. For any
$a \in R$, let $I(a) = \{i : a_i = 1\}$. By monotonicity, any $b \geq a$ is in $R$ and to
say that $b \geq a$ is just to say that $b_i = 1$ for every $i \in I(a)$. So, the formula

\[ \varphi_a(x) = \bigwedge_{i \in I(a)} x_i \]

is satisfied by exactly the tuples $b$ with $b \geq a$ and, therefore, $R$ is defined by
the formula

\[ \varphi(x) = \bigvee_{a \in R} \varphi_a(x). \]

This formula is in disjunctive normal form but, since it contains no negations,
it is equivalent to a negation-free formula in conjunctive normal form, which
is to say, to an OR-conj formula. The case where $R$ is only pseudo-monotone
is easily dealt with by using pins for the constant columns.
Finally,

\[ R \in \text{NAND-conj} \iff \widetilde{R} \in \text{OR-conj}
   \iff \widetilde{R} \text{ is monotone}
   \iff R \text{ is antitone.} \]  □
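
The negation-free DNF used in the converse direction can be evaluated directly. The sketch below is an illustration under our own naming, not the authors' code: it computes the set of tuples satisfying $\varphi(x) = \bigvee_{a \in R} \bigwedge_{i \in I(a)} x_i$ and compares it with $R$. Both example relations are hypothetical.

```python
from itertools import product

def dnf_models(relation, arity):
    """Tuples b satisfying phi(x) = OR over a in R of (AND over i in I(a) of x_i)."""
    def satisfies(b):
        return any(all(b[i] == 1 for i in range(arity) if a[i] == 1)
                   for a in relation)
    return {b for b in product((0, 1), repeat=arity) if satisfies(b)}

# A monotone relation: "at least two of the three bits are one".
monotone = {(0, 1, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1)}
# A relation that is not monotone: (0, 1, 1) lies above (0, 0, 1) but is missing.
not_monotone = {(0, 0, 1), (1, 1, 1)}
```

For the monotone relation the DNF defines exactly the relation; for the non-monotone one it defines the strictly larger upward closure, which is why monotonicity is needed in the proof.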
4 Simulating equality



An important ingredient in bounded-degree dichotomy theorems [4] is show- 
ing how to express equality using constraints from a constraint language that 
does not necessarily include the equality relation. In this section, we give 
the definitions that we need and some results about when equality can be 
expressed in our setting. 

Recall that, for all integers $k \geq 2$, $R_{=,k}$ is the $k$-ary equality relation
$\{0^k, 1^k\}$. A constraint language $\Gamma$ is said to simulate $R_{=,k}$ if, for some $\ell \geq k$,
there is an integer $m \geq 1$ and a $(\Gamma \cup \Gamma_{\mathrm{pin}})$-CSP instance $I$ with variables
$x_1, \dots, x_\ell$ and such that $I$ has exactly $m$ satisfying assignments $\sigma$ with
$\sigma(x_1) = \dots = \sigma(x_k) = 0$, exactly $m$ with $\sigma(x_1) = \dots = \sigma(x_k) = 1$ and
no other satisfying assignments. If, further, the degree of $I$ is $d$ and the
degree of each variable $x_1, \dots, x_k$ is at most $d - 1$, we say that $\Gamma$ simulates
$R_{=,k}$ with $d$ variable repetitions or, for brevity, that $\Gamma$ $d$-simulates $R_{=,k}$. We
say that $\Gamma$ $d$-simulates equality if it $d$-simulates $R_{=,k}$ for all $k \geq 2$. If only
one relation $R$ is involved in the simulation, we drop the curly brackets and
say that $R$, rather than $\{R\}$, $d$-simulates equality.

The point of this slightly strange definition is that, if $\Gamma$ $d$-simulates equality,
we can express the constraint $y_1 = \dots = y_r$ in $\Gamma \cup \Gamma_{\mathrm{pin}}$ and then use each
$y_i$ in one further constraint, while still having an instance of degree $d$. The
variables $x_{k+1}, \dots, x_\ell$ in the definition function as auxiliary variables. Since
they will not be used in any other constraint, we can use them $d$ times each
and still have an instance of degree $d$.

Proposition 9. If $\Gamma$ $d$-simulates equality, then
$\#\mathrm{CSP}(\Gamma) \leq_{\mathrm{AP}} \#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}})$.

Proof. Let $I$ be an instance of $\#\mathrm{CSP}(\Gamma)$. We produce a new CSP instance $I'$
over the constraint language $\Gamma$ augmented with $R_{=,i}$ constraints for certain
values of $i$ as follows. For each variable $x$ that appears $k > d$ times in $I$,
replace the occurrences with new variables $x_1, \dots, x_k$ and add the constraint
$R_{=,k}(x_1, \dots, x_k)$. Clearly, $Z(I') = Z(I)$.

Note that every variable in $I'$ either occurs exactly once in an $R_{=,i}$-constraint
and exactly once in a $\Gamma$-constraint or occurs in no $R_{=,i}$-constraints
and at most $d$ times in $\Gamma$-constraints. Since $\Gamma$ $d$-simulates equality, we can
replace the $R_{=,i}$-constraints with $(\Gamma \cup \Gamma_{\mathrm{pin}})$-constraints, using fresh auxiliary
variables for each equality, to give an instance $I''$ of $\#\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}})$ with
degree $d$. There is some constant $m$, depending only on the number and
arities of the equality constraints in $I'$, such that $Z(I'') = m Z(I')$. Since $m$
can be computed in polynomial time, we have an AP-reduction. □
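
The first step of this proof, splitting each high-degree variable into copies tied together by an equality constraint, can be sketched as follows. This is an illustrative Python sketch under our own assumptions: constraints are `(relation, scope)` pairs, the copy names `x_1, x_2, ...` are hypothetical and assumed not to clash with existing variable names, and counting is done by brute force, so it is only suitable for tiny instances.

```python
from collections import Counter
from itertools import product

def split_high_degree(constraints, d):
    """Replace each variable occurring more than d times by fresh copies,
    and add one equality constraint over the copies of each such variable."""
    degree = Counter(v for _, scope in constraints for v in scope)
    seen, copies, result = Counter(), {}, []
    for rel, scope in constraints:
        new_scope = []
        for v in scope:
            if degree[v] > d:
                seen[v] += 1
                name = f"{v}_{seen[v]}"  # hypothetical naming scheme
                copies.setdefault(v, []).append(name)
                new_scope.append(name)
            else:
                new_scope.append(v)
        result.append((rel, tuple(new_scope)))
    for names in copies.values():
        k = len(names)
        result.append(({(0,) * k, (1,) * k}, tuple(names)))  # R_{=,k}
    return result

def count_solutions(constraints):
    """Brute-force Z(I): the number of satisfying assignments."""
    variables = sorted({v for _, scope in constraints for v in scope})
    total = 0
    for values in product((0, 1), repeat=len(variables)):
        sigma = dict(zip(variables, values))
        if all(tuple(sigma[v] for v in scope) in rel for rel, scope in constraints):
            total += 1
    return total

R_OR = {(0, 1), (1, 0), (1, 1)}
# Example instance in which x has degree 4, to be reduced with d = 3.
instance = [(R_OR, ("x", "a")), (R_OR, ("x", "b")),
            (R_OR, ("x", "c")), (R_OR, ("x", "e"))]
split = split_high_degree(instance, 3)
```

The sketch stops at the instance $I'$; replacing the equality constraints by a simulation, as in the second paragraph of the proof, would multiply the count by the constant $m$.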

Lemma 10. Let $R \subseteq \{0,1\}^r$. If $R_{=} \leq_{\mathrm{ppp}} R$, $R_{\neq} \leq_{\mathrm{ppp}} R$ or $R_{\to} \leq_{\mathrm{ppp}} R$,
then $R$ 3-simulates equality.

Proof. For each $k \geq 2$, we show how to 3-simulate $R_{=,k}$. We may assume
without loss of generality that the ppp-definition of $R_{=}$, $R_{\neq}$ or $R_{\to}$ from $R$
involves applying the identity permutation to the columns, pinning columns
$3$ to $3 + p - 1$ inclusive to zero, pinning columns $3 + p$ to $3 + p + q - 1$ inclusive
to one (that is, pinning $p \geq 0$ columns to zero and $q \geq 0$ to one) and then
projecting away all but the first two columns.

Suppose first that $R_{=} \leq_{\mathrm{ppp}} R$ or $R_{\to} \leq_{\mathrm{ppp}} R$. $R$ must contain $\alpha \geq 1$
tuples that begin $000^p1^q$, $\beta \geq 0$ that begin $010^p1^q$ and $\gamma \geq 1$ that begin
$110^p1^q$, with $\beta = 0$ unless we are ppp-defining $R_{\to}$.

We consider, first, the case where $\alpha = \gamma$, showing that we can 3-simulate
$R_{=,k}$, expressing the constraint $R_{=,k}(x_1, \dots, x_k)$ with the constraints

\[ R(x_1 x_2 0^p 1^q *),\ R(x_2 x_3 0^p 1^q *),\ \dots,\ R(x_{k-1} x_k 0^p 1^q *),\ R(x_k x_1 0^p 1^q *), \]

where $*$ denotes a fresh $(r - 2 - p - q)$-tuple of variables in each constraint.
This set of constraints is equivalent to either $x_1 = \dots = x_k = x_1$ or
$x_1 \to \dots \to x_k \to x_1$ so, in either case, constrains the variables $x_1, \dots, x_k$ to
have the same value, as required. Every variable appears at most twice and
there are $\alpha^k$ solutions to these constraints that put $x_1 = \dots = x_k = 0$, the
same number with $x_1 = \dots = x_k = 1$ and no other solutions. Therefore, $R$
3-simulates $R_{=,k}$, as required.

We now show, by induction on $r$, that we can 3-simulate $R_{=,k}$ even in
the case that $\alpha$ is not necessarily equal to $\gamma$. For the base case, $r = 2$, we
have $\alpha = \gamma = 1$ and we are done. For the inductive step, let $r > 2$ and
assume, without loss of generality, that $\alpha > \gamma$ (we are already done if $\alpha = \gamma$
and the case $\alpha < \gamma$ is symmetric). In particular, we have $\alpha \geq 2$, so there are
distinct tuples $000^p1^qa$ and $000^p1^qb$ in $R$. $R$ also contains a tuple $110^p1^qc$.
Choose $j$ such that $a_j \neq b_j$. Pinning the $(2 + p + q + j)$th column of $R$ to
$c_j$ and projecting out the resulting constant column gives a relation of arity
$r - 1$ that still contains at least one tuple beginning $000^p1^q$ and at least
one beginning $110^p1^q$: by the inductive hypothesis, this relation 3-simulates
$R_{=,k}$.

Finally, we consider the case that $R_{\neq} \leq_{\mathrm{ppp}} R$. $R$ contains $\alpha \geq 1$ tuples
beginning $010^p1^q$ and $\beta \geq 1$ beginning $100^p1^q$. We express the constraint
$R_{=,k}(x_1, \dots, x_k)$ by introducing fresh variables $y_1, \dots, y_k$ and using the
constraints

\[ \begin{array}{ll}
R(x_1 y_1 0^p 1^q *), & R(y_1 x_2 0^p 1^q *), \\
R(x_2 y_2 0^p 1^q *), & R(y_2 x_3 0^p 1^q *), \\
\quad\vdots & \quad\vdots \\
R(x_{k-1} y_{k-1} 0^p 1^q *), & R(y_{k-1} x_k 0^p 1^q *), \\
R(x_k y_k 0^p 1^q *), & R(y_k x_1 0^p 1^q *),
\end{array} \]

where $*$ denotes a fresh $(r - 2 - p - q)$-tuple of variables in each constraint,
as before. There are $\alpha^k \beta^k$ solutions when $x_1 = \dots = x_k = 0$ (and
$y_1 = \dots = y_k = 1$) and $\beta^k \alpha^k$ solutions when the $x$s are 1 and the $y$s are 0. There are
no other solutions and no variable is used more than twice. □
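
The cyclic construction for the case $\alpha = \gamma$ can be checked by brute force. The following Python sketch is illustrative only: the ternary relation `R` is a hypothetical example in which projecting onto the first two columns gives $R_{\to}$ (so $p = q = 0$ and $\alpha = \beta = \gamma = 2$), and the variable names `x0, x1, ...` and `s0_0, ...` are our own.

```python
from collections import Counter
from itertools import product

def cycle_instance(relation, k):
    """Constraints R(x_i, x_{i+1}, *) around a cycle, with fresh * variables."""
    arity = len(next(iter(relation)))
    constraints = []
    for i in range(k):
        fresh = tuple(f"s{i}_{j}" for j in range(arity - 2))
        constraints.append((relation, (f"x{i}", f"x{(i + 1) % k}") + fresh))
    return constraints

def solutions_by_x(constraints, k):
    """Count satisfying assignments, grouped by the values of x_0, ..., x_{k-1}."""
    variables = sorted({v for _, scope in constraints for v in scope})
    counts = Counter()
    for values in product((0, 1), repeat=len(variables)):
        sigma = dict(zip(variables, values))
        if all(tuple(sigma[v] for v in scope) in rel for rel, scope in constraints):
            counts[tuple(sigma[f"x{i}"] for i in range(k))] += 1
    return counts

# Hypothetical relation: x1 -> x2 on the first two columns, third column free.
R = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 0), (1, 1, 1)}
counts = solutions_by_x(cycle_instance(R, 3), 3)
```

With $k = 3$ the only $x$-patterns that occur are all-zero and all-one, each with $\alpha^k = 8$ extensions, and every variable is used at most twice.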

The following technical lemma and the definitions that support it are
used only to prove Lemma 12. For $c \in \{0,1\}$, an $r$-ary relation is $c$-valid if
it contains the tuple $c^r$. Given a relation $R \subseteq \{0,1\}^r$, a tuple $a \in R$ that
contains both zeroes and ones and a constant $c \in \{0,1\}$, let $R_{a,c}$ be the
result of pinning the set of columns $\{i \mid a_i = c\}$ to $c$ and then projecting out
those columns. Observe that $R_{a,c}$ is always $(1 - c)$-valid (because it contains
the projection of $a$) and is $c$-valid if $R$ is (because it contains the projection
of $c^r$).

Lemma 11. Let $r \geq 3$ and let $R_{=,r} \subsetneq R \subsetneq \{0,1\}^r$. There are $a \in R$ and
$c \in \{0,1\}$ such that $R_{a,c}$ is not complete.

Proof. Suppose there is a tuple $a \in R \setminus \{0^r\}$ such that changing some zero
in $a$ to a one gives a tuple $a' \notin R$. Then $R_{a,1}$ does not contain the relevant
projection of $a'$ and we are done. Similarly, if there is a tuple $b \in R \setminus \{1^r\}$
that leaves $R$ by changing some one to a zero, then $R_{b,0}$ is not complete.
If no such tuple exists, then either $R = \{0,1\}^r$ or $R = R_{=,r}$, violating our
assumptions. □
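
As a small illustration of the relations $R_{a,c}$ and of Lemma 11, the following Python sketch searches for a witness pair $(a, c)$ by brute force. The function names and the example relation are our own assumptions; columns are 0-indexed in the code.

```python
def restrict(relation, a, c):
    """R_{a,c}: pin the columns where a has value c to c, then project them out."""
    pinned = [i for i, ai in enumerate(a) if ai == c]
    kept = [i for i, ai in enumerate(a) if ai != c]
    return {tuple(t[i] for i in kept)
            for t in relation
            if all(t[i] == c for i in pinned)}

def find_incomplete(relation):
    """Return (a, c, R_{a,c}) with R_{a,c} not complete, or None if none exists."""
    for a in sorted(relation):
        if 0 in a and 1 in a:  # a must contain both zeroes and ones
            for c in (0, 1):
                sub = restrict(relation, a, c)
                arity = sum(1 for ai in a if ai != c)
                if len(sub) < 2 ** arity:
                    return a, c, sub
    return None

# Hypothetical example: ternary equality plus one extra tuple.
R = {(0, 0, 0), (1, 1, 1), (0, 1, 1)}
witness = find_incomplete(R)
```

Here the witness is $a = 011$ with $c = 0$: pinning the first column to zero leaves binary equality, which is 0- and 1-valid but not complete.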

Lemma 12. Let $r \geq 2$ and let $R \subseteq \{0,1\}^r$ be 0- and 1-valid but not complete.
Then $R$ 3-simulates equality.

Proof. We show by induction on $r$, which must be at least two, that either
$R_{=}$ or $R_{\to}$ is ppp-definable in $R$, and the result follows by Lemma 10.

In the case $r = 2$, $R$ is either $R_{=}$, $R_{\to}$ or $R_{\to}$ with the columns swapped
and we are done. For $r \geq 3$, if $R = R_{=,r}$ then $R_{=}$ is definable by projecting
onto the first two columns. Otherwise, by Lemma 11 there is some $a \in R$
and $c \in \{0,1\}$ such that $R_{a,c}$ is not complete. Since $R_{a,c} \leq_{\mathrm{ppp}} R$ and is 0-
and 1-valid, we are done by the inductive hypothesis. □

We will next show that, if binary OR is ppp-definable in $R$ and binary
NAND in $R'$, then the constraint language $\{R, R'\}$ 3-simulates equality ($R$
and $R'$ need not be distinct). To do this, we will use the following sets of
constraints, for $k \geq 2$:

\[ \Theta_k = \{R_{\mathrm{OR}}(x_i, y_i) \mid 1 \leq i \leq k\} \cup
   \{R_{\mathrm{NAND}}(y_i, x_{i+1}) \mid 1 \leq i < k\} \cup \{R_{\mathrm{NAND}}(y_k, x_1)\}. \]

The key point about these constraints is that they show that the language
$\{R_{\mathrm{OR}}, R_{\mathrm{NAND}}\}$ 3-simulates equality.

Lemma 13. Let $\sigma \colon \{x_1, \dots, x_k, y_1, \dots, y_k\} \to \{0,1\}$. $\sigma$ satisfies $\Theta_k$ if, and
only if,

\[ \sigma(x_1) = \dots = \sigma(x_k) \neq \sigma(y_1) = \dots = \sigma(y_k). \]

Proof. It is easy to check that assignments of the given type satisfy $\Theta_k$.
Conversely, suppose that $\sigma$ satisfies $\Theta_k$.

If $\sigma(x_1) = 0$, we have $\sigma(y_1) = 1$ because $R_{\mathrm{OR}}(x_1, y_1)$ is satisfied and
we must have $\sigma(x_2) = 0$ because $R_{\mathrm{NAND}}(y_1, x_2)$ is satisfied. By a trivial
induction, $\sigma(x_i) = 0$ and $\sigma(y_i) = 1$ for all $i$.

Otherwise, $\sigma(x_1) = 1$. If $\sigma(x_i) = 0$ for any $i > 1$ then, by the same
argument as above, $\sigma(x_j) = 0$ for all $j \in [1, k]$, contradicting the assumption
that $\sigma(x_1) = 1$. Therefore, $\sigma(x_i) = 1$ for all $i$. In order to satisfy the
constraints $R_{\mathrm{NAND}}(y_i, x_{i+1})$ and $R_{\mathrm{NAND}}(y_k, x_1)$, we must have $\sigma(y_i) = 0$ for all $i$. □
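
Lemma 13 is easy to confirm by exhaustive search for small $k$. The Python sketch below is an illustration under our own naming: it enumerates all assignments to $x_1, \dots, x_k, y_1, \dots, y_k$ and keeps those satisfying the OR constraints on $(x_i, y_i)$ and the cyclic NAND constraints on $(y_i, x_{i+1})$.

```python
from itertools import product

def theta_solutions(k):
    """All pairs (x, y) of k-tuples satisfying the constraint set Theta_k."""
    solutions = []
    for bits in product((0, 1), repeat=2 * k):
        x, y = bits[:k], bits[k:]
        # R_OR(x_i, y_i) for every i.
        or_ok = all(x[i] or y[i] for i in range(k))
        # R_NAND(y_i, x_{i+1}) for i < k, and R_NAND(y_k, x_1), via cyclic indexing.
        nand_ok = all(not (y[i] and x[(i + 1) % k]) for i in range(k))
        if or_ok and nand_ok:
            solutions.append((x, y))
    return solutions
```

For each small $k$ tested, exactly two assignments survive: all $x_i = 0$ with all $y_i = 1$, and all $x_i = 1$ with all $y_i = 0$.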

We now show that, in fact, we do not need to have $R_{\mathrm{OR}}$ and $R_{\mathrm{NAND}}$ in
our constraint language $\Gamma$: it suffices to be able to ppp-define them from
relations in $\Gamma$.

Lemma 14. If $R_{\mathrm{OR}} \leq_{\mathrm{ppp}} R$ and $R_{\mathrm{NAND}} \leq_{\mathrm{ppp}} R'$ then $\{R, R'\}$ 3-simulates
equality.

Proof. Suppose first that $R$ and $R'$ are two distinct relations. We may
assume, as in the proof of Lemma 10, that the ppp-definition of $R_{\mathrm{OR}}$ from
$R$ involves performing some permutation and projecting to the first two
columns after pinning the next $p$ columns to zero and the $q$ columns after
that to one. We may suppose further that it is not possible to pin any more
columns of $R$ and still ppp-define $R_{\mathrm{OR}}$. Without loss of generality, we may
assume the permutation to be the identity.

Under these assumptions, $R$ contains $\alpha \geq 1$ tuples beginning $010^p1^q$,
$\beta \geq 1$ tuples beginning $100^p1^q$ and $\gamma \geq 1$ tuples beginning $110^p1^q$, but none
beginning $000^p1^q$. We first show that, if $\alpha \neq \beta$, then we are done because
$R_{\neq} \leq_{\mathrm{ppp}} R$, which means that $R$ 3-simulates equality by Lemma 10.

To this end, suppose $\alpha > \beta$ so, in particular, $\alpha \geq 2$ and there are
distinct tuples $010^p1^qa$ and $010^p1^qb$ in $R$. We may assume, without loss of
generality, that $a_1 \neq b_1$. Since $\beta \geq 1$, there is at least one tuple $100^p1^qc \in R$.
Suppose, now, that we pin the $(2 + p + q + 1)$th column of $R$ to $c_1$. $R$
cannot contain any tuple $110^p1^qd$ with $d_1 = c_1$ because it is not possible
to pin more columns and still ppp-define $R_{\mathrm{OR}}$. But then $R$ contains tuples
beginning with each of $010^p1^qc_1$ and $100^p1^qc_1$ and none beginning $000^p1^qc_1$
or $110^p1^qc_1$, so $R_{\neq} \leq_{\mathrm{ppp}} R$. We similarly have $R_{\neq} \leq_{\mathrm{ppp}} R$ if $\alpha < \beta$. From
this point, we may assume that $\alpha = \beta$.

Similarly, either $R_{\neq} \leq_{\mathrm{ppp}} R'$, in which case we are done, or $R'$ contains
$\alpha'$ tuples beginning with each of $010^{p'}1^{q'}$ and $100^{p'}1^{q'}$, $\gamma'$ tuples beginning
$000^{p'}1^{q'}$ and no tuples beginning $110^{p'}1^{q'}$.

We now show how to simulate equality. We can 3-simulate $R_{=,k}$ by
replacing the constraint $R_{=,k}(x_1, \dots, x_k)$ with the following set of constraints,
modelled on $\Theta_k$:

\[ \Xi_k = \{R(x_i y_i 0^p 1^q *) \mid 1 \leq i \leq k\} \cup
   \{R'(y_i x_{i+1} 0^{p'} 1^{q'} *) \mid 1 \leq i < k\} \cup \{R'(y_k x_1 0^{p'} 1^{q'} *)\}, \]

where the $y_i$ are fresh variables and, as before, $*$ denotes a fresh tuple of
variables for each constraint, of the appropriate length. By Lemma 13, an
assignment $\sigma$ satisfies $\Xi_k$ if, and only if,

\[ \sigma(x_1) = \dots = \sigma(x_k) \neq \sigma(y_1) = \dots = \sigma(y_k). \]

Further, there are $\alpha$ ways to satisfy the variables denoted by $*$ in each $R$
constraint and $\alpha'$ ways in each $R'$ constraint. Therefore, there are $(\alpha\alpha')^k$
satisfying assignments for $\Xi_k$ corresponding to each satisfying assignment
for $R_{=,k}$ and we are done.

Notice that our assumptions that the ppp-definitions of $R_{\mathrm{OR}}$ in $R$ and
$R_{\mathrm{NAND}}$ in $R'$ involve the identity permutation, pinning sequential columns to
zero and one and projecting to the first two columns were made only for the
notational convenience of referring to "tuples beginning $010^p1^q$" and so on.
This being the case, there is no requirement that $R$ and $R'$ be distinct, so
the proof is complete. □

Note that there are relations, such as $R_{=,3}$, that 2-simulate equality,
though we do not require this here, so we omit the proof.



5 Classifying relations 

We are now ready to prove that every Boolean relation $R$ is in OR-conj, in
NAND-conj or 3-simulates equality. Given $r$-ary relations $R_0$ and $R_1$, we
write $R_0 + R_1$ for the relation $\{0a \mid a \in R_0\} \cup \{1a \mid a \in R_1\}$. The proof of the
classification is by induction on the arity of $R$ and proceeds by decomposing
$R$ as $R_0 + R_1$.
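
The decomposition $R = R_0 + R_1$ is simply a split on the first column. A minimal Python sketch, with function names of our own choosing, is:

```python
def plus(r0, r1):
    """R_0 + R_1: prefix tuples of R_0 with 0 and tuples of R_1 with 1."""
    return {(0,) + a for a in r0} | {(1,) + a for a in r1}

def decompose(r):
    """Inverse of `plus`: split a relation on the value of its first column."""
    r0 = {t[1:] for t in r if t[0] == 0}
    r1 = {t[1:] for t in r if t[0] == 1}
    return r0, r1

R_OR = {(0, 1), (1, 0), (1, 1)}
r0, r1 = decompose(R_OR)
```

For binary OR this gives $R_0 = \{1\}$ and $R_1 = \{0, 1\}$, so $R_0 \subseteq R_1$, which is the situation of Case 1 in the proof of Lemma 15.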

Lemma 15. Let $R_0, R_1 \in \text{OR-conj}$ and let $R = R_0 + R_1$. Then $R \in \text{OR-conj}$,
$R \in \text{NAND-conj}$ or $R$ 3-simulates equality.

Proof. Let $R_0$ and $R_1$ have arity $r$. We may assume that $R$ has no constant
columns. If it does, let $R'$ be the relation that results from projecting away
all the constant columns. Then $R' = R'_0 + R'_1$, where both $R'_0$ and $R'_1$ are
OR-conj relations. By the remainder of the proof, $R' \in \text{OR-conj}$, $R' \in \text{NAND-conj}$
or $R'$ 3-simulates equality. Re-instating the constant columns
does not alter this.

For $R$ with no constant columns, there are two cases.

Case 1. $R_0 \subseteq R_1$. Suppose $R_i$ is defined by the normalized OR-conj
formula $\varphi_i$ in variables $x_2, \dots, x_{r+1}$. Then $R$ is defined by the formula

\[ \begin{aligned}
\varphi_0 \vee (x_1 = 1 \wedge \varphi_1)
  &\equiv (\varphi_0 \vee x_1 = 1) \wedge (\varphi_0 \vee \varphi_1) \\
  &\equiv (\varphi_0 \vee x_1 = 1) \wedge \varphi_1,
\end{aligned} \qquad (1) \]

where the first equivalence is the distribution law and the second is because
$\varphi_0$ implies $\varphi_1$ (because $R_0 \subseteq R_1$). Note that $R_1$ cannot have any constant
columns, since a constant column in $R_1$ would have to be constant with the
same value in $R_0$, contradicting our assumption that $R$ itself has no constant
columns. We consider the following two cases.

Case 1.1. $R_0$ has no constant columns. Since $x_1 = 1$ is equivalent to
$\mathrm{OR}(x_1)$ and $\varphi_0$ does not contain any pins, we can rewrite $\varphi_0 \vee x_1 = 1$ in
CNF, and hence the formula (1) defines an OR-conj relation.

Case 1.2. $R_0$ has a constant column. As argued above, the corresponding
column of $R_1$ cannot be constant. Suppose first that there is a constant
column of $R_0$, say the $k$th column, that contains only zeros. Then the
projection of $R$ onto its first and $(k+1)$st columns gives the relation $R_{\to}$
(with reversed columns), so $R$ 3-simulates equality by Lemma 10. Suppose
second that all constant columns of $R_0$ contain ones. Then $\varphi_0$ is in CNF
since every pinning $x_i = 1$ in $\varphi_0$ can be written $\mathrm{OR}(x_i)$. We can therefore
rewrite $\varphi_0 \vee x_1 = 1$ in CNF, and hence the formula (1) defines an OR-conj
relation.

Case 2. $R_0 \not\subseteq R_1$. We will show that $R$ 3-simulates equality or is in
NAND-conj. We consider two cases (recall that no relation has width 1).

Case 2.1. At least one of $R_0$ and $R_1$ has positive width. We consider the
following two cases.

Case 2.1.1. $R_1$ has a constant column. Suppose the $k$th column of $R_1$ is
constant. If the $k$th column of $R_0$ is also constant, then the projection of
$R$ to its first and $(k+1)$st columns is either equality or disequality (since
the corresponding column of $R$ is not constant) so $R$ 3-simulates equality
by Lemma 10. Otherwise, if the projection of $R$ to the first and $(k+1)$st
columns is $R_{\to}$, then $R$ 3-simulates equality by Lemma 10. Otherwise, that
projection must be $R_{\mathrm{NAND}}$. By Lemma 6 and the assumption of Case 2.1,
$R_{\mathrm{OR}}$ is ppp-definable in at least one of $R_0$ and $R_1$ so $R$ 3-simulates equality
by Lemma 14.

Case 2.1.2. $R_1$ has no constant columns. By Proposition 8, $R_1$ is monotone.
Let $a \in R_0 \setminus R_1$: by applying the same permutation to the columns
of $R_0$ and $R_1$, we may assume that $a = 0^\ell 1^{r-\ell}$. We must have $\ell \geq 1$ as
every non-empty $r$-ary monotone relation contains the tuple $1^r$. Let $b \in R_1$
be a tuple such that $a_i = b_i$ for a maximal initial segment of $[1, r]$. By
monotonicity of $R_1$, we may assume that $b = 0^k 1^{r-k}$. Further, we must
have $k < \ell$, since, otherwise, we would have $b \leq a$, contradicting our choice
of $a \notin R_1$.

Now, consider the relation

\[ R' = \{a_0 a_1 \dots a_{\ell-k} \mid a_0 0^k a_1 \dots a_{\ell-k} 1^{r-\ell} \in R\}, \]

which is the result of pinning columns $2$ to $(k+1)$ of $R$ to zero and columns
$(\ell+2)$ to $(r+1)$ to one and discarding the resulting constant columns.
$R'$ contains $0^{\ell-k+1}$ and $1^{\ell-k+1}$ but is not complete, since it does not contain
$10^{\ell-k}$. By Lemma 12, $R'$ and, hence, $R$ 3-simulates equality.

Case 2.2. Both $R_0$ and $R_1$ have width zero, i.e., are complete relations,
possibly padded with constant columns. For $i \in [1, r]$ let $R'_i$ be the relation
obtained from $R$ by projecting onto its first and $(i+1)$st columns. Since $R$
has no constant columns, $R'_i$ is either complete, $R_{=}$, $R_{\neq}$, $R_{\mathrm{OR}}$, $R_{\mathrm{NAND}}$, $R_{\to}$
or the bit-wise complement of $R_{\to}$. If there is a $k$ such that $R'_k$ is $R_{=}$, $R_{\neq}$,
$R_{\to}$ or the bit-wise complement of $R_{\to}$ then $R_{=}$, $R_{\neq}$ or $R_{\to}$ is ppp-definable
in $R$ and hence $R$ 3-simulates equality by Lemma 10. If there are $k_1$ and
$k_2$ such that $R'_{k_1} = R_{\mathrm{OR}}$ and $R'_{k_2} = R_{\mathrm{NAND}}$ then $R$ 3-simulates equality by
Lemma 14. It remains to consider the following two cases.

Case 2.2.1. Each $R'_i$ is either $R_{\mathrm{OR}}$ or complete. $R_1$ must be complete,
which contradicts the assumption that $R_0 \not\subseteq R_1$.

Case 2.2.2. Each $R'_i$ is either $R_{\mathrm{NAND}}$ or complete. $R_0$ must be complete.
Let $I = \{i \mid R'_i = R_{\mathrm{NAND}}\}$. Then

\[ R = \bigwedge_{i \in I} \mathrm{NAND}(x_1, x_{i+1}), \]

so $R \in \text{NAND-conj}$. □

The following corollary follows from Proposition 7 and the facts that
$\widetilde{R_0 + R_1} = \widetilde{R_1} + \widetilde{R_0}$ and that, if $R$ 3-simulates equality, then so does $\widetilde{R}$ (since
the equality relation is its own bit-wise complement).

Corollary 16. Let $R_0, R_1 \in \text{NAND-conj}$ and let $R = R_0 + R_1$. Then
$R \in \text{OR-conj}$, $R \in \text{NAND-conj}$ or $R$ 3-simulates equality.

Theorem 17. Every Boolean relation is in OR-conj, is in NAND-conj or
3-simulates equality.

Proof. Induction on the arity, $r$. Every relation of arity less than two is
both in OR-conj and in NAND-conj; the same holds for every relation of arity
two that has at least one constant column or is complete. The only other binary
relations are $R_{\mathrm{OR}}$, which is OR-conj, $R_{\mathrm{NAND}}$, which is NAND-conj, and $R_{=}$,
$R_{\neq}$, $R_{\to}$ and its bit-wise complement, all of which 3-simulate equality by
Lemma 10.

For the inductive step, let $R$ be a relation of arity $r + 1 > 2$ and let $R_0$
and $R_1$ be such that $R = R_0 + R_1$. By the inductive hypothesis, each of $R_0$
and $R_1$ is in OR-conj, in NAND-conj or 3-simulates equality.

If either of $R_0$ and $R_1$ 3-simulates equality, then so does $R$. Otherwise,
either both are in OR-conj, both are in NAND-conj or exactly one is in
OR-conj and exactly one is in NAND-conj. In the first case, $R$ is in OR-conj
or in NAND-conj or 3-simulates equality by Lemma 15; in the second case,
$R$ is in OR-conj or in NAND-conj or 3-simulates equality by Corollary 16
and, in the third case, $R$ 3-simulates equality by Lemma 14. □

6 Complexity 

The complexity of approximating $\#\mathrm{CSP}(\Gamma)$ where the degree of instances is
unbounded is given by Dyer, Goldberg and Jerrum.

Theorem 18 ([13, Theorem 3]). Let $\Gamma$ be a Boolean constraint language.

• If every $R \in \Gamma$ is affine, then $\#\mathrm{CSP}(\Gamma) \in \mathrm{FP}$.

• Otherwise, if $\Gamma \subseteq \text{IM-conj}$, then $\#\mathrm{CSP}(\Gamma) =_{\mathrm{AP}} \#\mathrm{BIS}$.

• Otherwise, $\#\mathrm{CSP}(\Gamma) =_{\mathrm{AP}} \#\mathrm{SAT}$.

Working towards our classification of the approximation complexity of
$\#\mathrm{CSP}(\Gamma)$, we first deal with subcases.

Proposition 19. If $\Gamma \subseteq \text{IM-conj}$ contains at least one non-affine relation,
then $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#\mathrm{BIS}$ for all $d \geq 3$.

Proof. It is immediate from [13, Lemma 9] that $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) \leq_{\mathrm{AP}} \#\mathrm{BIS}$.

For the converse, first observe that, by [13, Lemma 8], $\#\mathrm{BIS} \leq_{\mathrm{AP}}
\#\mathrm{CSP}(\{R_{\to}\})$ and, since $R_{\to}$ 3-simulates equality by Lemma 10, we have
$\#\mathrm{CSP}(\{R_{\to}\}) \leq_{\mathrm{AP}} \#\mathrm{CSP}_d(\{R_{\to}\} \cup \Gamma_{\mathrm{pin}})$ for all $d \geq 3$ by Proposition 9. It
remains to show that $\#\mathrm{CSP}_d(\{R_{\to}\} \cup \Gamma_{\mathrm{pin}}) \leq_{\mathrm{AP}} \#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}})$.

To this end, let $R$ be any non-affine relation in $\Gamma$. By Lemma 5, $R_{\to} \leq_{\mathrm{ppp}} R$
and the ppp-definition involves projecting only pinned columns. Therefore,
we can express the constraint $R_{\to}(x, y)$ by a constraint of the form
$R(v_1, \dots, v_r)$, where, for some $i$ and $j$, $v_i = x$ and $v_j = y$ and the other
variables are pinned to zero or one. □

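Lemma 20. For $d \geq 2$ and $w \geq 2$,
$\#w\text{-}\mathrm{HIS}_d =_{\mathrm{AP}} \#\mathrm{CSP}_d(\{R_{\mathrm{OR},w}\} \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#\mathrm{CSP}_d(\{R_{\mathrm{NAND},w}\} \cup \Gamma_{\mathrm{pin}})$.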

Proof. The second equivalence is trivial, since $R_{\mathrm{OR},w}$ and $R_{\mathrm{NAND},w}$ are
bit-wise-complementary.

For the first equivalence, let $H$ be an instance of $\#w\text{-}\mathrm{HIS}_d$. We create
an instance of $\#\mathrm{CSP}_d(\{R_{\mathrm{OR},w}\} \cup \Gamma_{\mathrm{pin}})$ as follows. The variables are
$\{x_v \mid v \in V(H)\}$ and, for each hyper-edge $\{v_1, \dots, v_s\}$, there is a constraint
$R_{\mathrm{OR},w}(x_{v_1}, \dots, x_{v_s}, 0, \dots, 0)$. Each vertex appears in at most $d$ hyper-edges
so each variable appears in at most $d$ constraints. It is easy to see that a
configuration $\sigma$ of the resulting $\#\mathrm{CSP}_d(\{R_{\mathrm{OR},w}\} \cup \Gamma_{\mathrm{pin}})$ instance is satisfying
if, and only if, $\{v \mid \sigma(x_v) = 0\}$ is an independent set in $H$.

Conversely, if we are given an instance of $\#\mathrm{CSP}_d(\{R_{\mathrm{OR},w}\} \cup \Gamma_{\mathrm{pin}})$, we
create an instance $H$ of $\#w\text{-}\mathrm{HIS}_d$ as follows. There is a vertex $v_x$ for every
variable $x$. For every constraint $R_{\mathrm{OR},w}(x_1, \dots, x_w)$ (where the $x_i$ are not
necessarily distinct), add the hyper-edge $\{v_{x_1}, \dots, v_{x_w}\}$. Now, for every
constraint $R_{\mathrm{zero}}(x)$, delete the vertex $v_x$ and remove it from every hyper-edge
that contains it. For every constraint $R_{\mathrm{one}}(x)$, delete $v_x$ and delete
every hyper-edge that contains it. It is easy to see that a configuration $\sigma$
is satisfying if, and only if, it satisfies the pins and the set
$\{v_x \mid \sigma(x) = 0\} \cap V(H)$ is independent in $H$. □
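
The first direction of this proof can be illustrated on a small hypergraph. The Python sketch below is an assumption-laden illustration, not the authors' code: it counts independent sets directly and counts satisfying assignments of the corresponding $R_{\mathrm{OR},w}$ instance, padding each scope with constant zeros as in the proof. The hypergraph and the function names are hypothetical.

```python
from itertools import product

def r_or(w):
    """R_{OR,w}: every w-tuple over {0, 1} except the all-zero tuple."""
    return {t for t in product((0, 1), repeat=w) if any(t)}

def count_independent_sets(vertices, edges):
    """Number of vertex sets that contain no hyper-edge as a subset."""
    count = 0
    for bits in product((0, 1), repeat=len(vertices)):
        chosen = {v for v, b in zip(vertices, bits) if b}
        if not any(set(e) <= chosen for e in edges):
            count += 1
    return count

def count_csp_solutions(vertices, edges, w):
    """Satisfying assignments of the constraints R_{OR,w}(x_{v_1}, ..., x_{v_s}, 0, ..., 0)."""
    relation = r_or(w)
    count = 0
    for bits in product((0, 1), repeat=len(vertices)):
        sigma = dict(zip(vertices, bits))
        if all(tuple(sigma[v] for v in e) + (0,) * (w - len(e)) in relation
               for e in edges):
            count += 1
    return count

# Hypothetical hypergraph of width 3 and degree 2.
vertices = [1, 2, 3, 4]
edges = [(1, 2, 3), (3, 4)]
```

The two counts agree because a satisfying assignment corresponds to the independent set of vertices whose variables are assigned zero.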

In the following two propositions, we just prove the OR-conj cases; the 
NAND-conj cases are equivalent. 

Proposition 21. Let $R$ be an OR-conj or NAND-conj relation of width $w$.
Then for $d \geq 2$, $\#w\text{-}\mathrm{HIS}_d \leq_{\mathrm{AP}} \#\mathrm{CSP}_d(\{R\} \cup \Gamma_{\mathrm{pin}})$.

Proof. By Lemma 6, $R_{\mathrm{OR},w} \leq_{\mathrm{ppp}} R$ and the ppp-definition involves pinning
and then projecting away all but $w$ of the columns. Thus, an $R_{\mathrm{OR},w}$-constraint
can be simulated by an $R$-constraint in which some elements of the scope
are constants. The result follows from Lemma 20. □

Proposition 22. Let $R$ be an OR-conj or NAND-conj relation of width $w$.
Then for $d \geq 2$, $\#\mathrm{CSP}_d(\{R\} \cup \Gamma_{\mathrm{pin}}) \leq_{\mathrm{AP}} \#w\text{-}\mathrm{HIS}_{kd}$, where $k$ is the greatest
number of times that any variable appears in the normalized formula defining
$R$.

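Proof. Given an instance $I$ of $\#\mathrm{CSP}_d(\{R\} \cup \Gamma_{\mathrm{pin}})$, we produce an instance
$I'$ of the problem $\#\mathrm{CSP}(\{R_{\mathrm{OR},2}, \dots, R_{\mathrm{OR},w}\} \cup \Gamma_{\mathrm{pin}})$ with the same variables
by replacing every $R$-constraint with the $R_{\mathrm{OR},j}$-constraints and pins
corresponding to the normalized formula that defines $R$. Clearly, $Z(I) = Z(I')$
but a variable that appeared $d$ times in $I$ might appear $kd$ times in $I'$, so we
have established that
$\#\mathrm{CSP}_d(\{R\} \cup \Gamma_{\mathrm{pin}}) \leq_{\mathrm{AP}} \#\mathrm{CSP}_{kd}(\{R_{\mathrm{OR},2}, \dots, R_{\mathrm{OR},w}\} \cup \Gamma_{\mathrm{pin}})
\leq_{\mathrm{AP}} \#\mathrm{CSP}_{kd}(\{R_{\mathrm{OR},w}\} \cup \Gamma_{\mathrm{pin}})$, where the last reduction follows from the fact
that an $R_{\mathrm{OR},w}$-constraint with zero-pins can replace any $R_{\mathrm{OR},s}$-constraint
for $s < w$. By Lemma 20, $\#\mathrm{CSP}_{kd}(\{R_{\mathrm{OR},w}\} \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#w\text{-}\mathrm{HIS}_{kd}$. □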

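We now give the complexity of approximating $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}})$ for $d \geq 3$.

Theorem 23. Let $\Gamma$ be a Boolean constraint language and let $d \geq 3$.

• If every $R \in \Gamma$ is affine, then $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) \in \mathrm{FP}$.

• Otherwise, if $\Gamma \subseteq \text{IM-conj}$, then $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#\mathrm{BIS}$.

• Otherwise, if $\Gamma \subseteq \text{OR-conj}$ or $\Gamma \subseteq \text{NAND-conj}$, then let $w$ be the
greatest width of any relation in $\Gamma$ and let $k$ be the greatest number of
times that any variable appears in the normalized formulae defining the
relations of $\Gamma$. Then $\#w\text{-}\mathrm{HIS}_d \leq_{\mathrm{AP}} \#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) \leq_{\mathrm{AP}} \#w\text{-}\mathrm{HIS}_{kd}$.

• Otherwise, $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#\mathrm{SAT}$.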

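Proof. The affine case follows immediately from Theorem 18. Note that
$\Gamma \cup \Gamma_{\mathrm{pin}}$ is affine if, and only if, $\Gamma$ is.

Otherwise, suppose $\Gamma \subseteq \text{IM-conj}$ and some $R \in \Gamma$ is not affine. By
Proposition 19, $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#\mathrm{BIS}$.

Otherwise, suppose that $\Gamma \subseteq \text{OR-conj}$ or $\Gamma \subseteq \text{NAND-conj}$. Then
$\#w\text{-}\mathrm{HIS}_d \leq_{\mathrm{AP}} \#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) \leq_{\mathrm{AP}} \#w\text{-}\mathrm{HIS}_{kd}$ by Propositions 21 and 22.

Finally, suppose that $\Gamma$ is not affine, $\Gamma \not\subseteq \text{IM-conj}$, $\Gamma \not\subseteq \text{OR-conj}$ and
$\Gamma \not\subseteq \text{NAND-conj}$. Since $\Gamma \cup \Gamma_{\mathrm{pin}}$ is neither affine nor a subset of IM-conj, we
have $\#\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#\mathrm{SAT}$ by Theorem 18 so, if we can show that $\Gamma$
$d$-simulates equality, then $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}})$ by
Proposition 9 and we are done. If $\Gamma$ contains a relation $R$ that is neither OR-conj
nor NAND-conj, then $R$ 3-simulates equality by Theorem 17. Otherwise, $\Gamma$
must contain distinct relations $R_1 \in \text{OR-conj}$ and $R_2 \in \text{NAND-conj}$ that
are non-affine and so have width at least two. So $\Gamma$ 3-simulates equality by
Lemma 14. □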

Dyer, Frieze and Jerrum have shown that no FPRAS can exist for the
problem of counting independent sets in graphs of maximum degree at
least 25, unless NP = RP [11]. Clearly, if there is no FPRAS for the problem
of counting independent sets in such graphs, there can be no FPRAS for
$\#w\text{-}\mathrm{HIS}_d$ with $w \geq 2$ and $d \geq 25$. Further, since $\#\mathrm{SAT}$ is complete for $\#\mathrm{P}$
with respect to AP-reducibility [12], $\#\mathrm{SAT}$ cannot have an FPRAS unless
NP = RP. From Theorem 23 above we have the following corollary.

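Corollary 24. Let $\Gamma$ be a Boolean constraint language and let $d \geq 25$.

• If every $R \in \Gamma$ is affine, then $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) \in \mathrm{FP}$.

• Otherwise, if $\Gamma \subseteq \text{IM-conj}$, then $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#\mathrm{BIS}$.

• Otherwise there is no FPRAS for $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}})$, unless NP = RP.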
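
The case analysis of Theorem 23 and Corollary 24 can be summarised as a small decision procedure. The Python sketch below is only a restatement of the two results under our own interface: the Boolean flags describing $\Gamma$ are assumed to be supplied by the caller (deciding them is not addressed here), and the function name and returned strings are hypothetical.

```python
def classify(all_affine, in_im_conj, in_or_conj_or_nand_conj, d):
    """Summarise Theorem 23 (d >= 3) and Corollary 24 (d >= 25) for #CSP_d(Gamma u Gamma_pin).

    all_affine: every relation in Gamma is affine.
    in_im_conj: Gamma is a subset of IM-conj.
    in_or_conj_or_nand_conj: Gamma is a subset of OR-conj or of NAND-conj.
    """
    if d < 3:
        raise ValueError("Theorem 23 and Corollary 24 only cover d >= 3")
    if all_affine:
        return "in FP"
    if in_im_conj:
        return "AP-equivalent to #BIS"
    if d >= 25:
        return "no FPRAS unless NP = RP"
    if in_or_conj_or_nand_conj:
        return "between #w-HIS_d and #w-HIS_kd under AP-reductions"
    return "AP-equivalent to #SAT"
```

The order of the tests mirrors the "Otherwise" chain in the statements: the affine and IM-conj cases are checked first, and the degree threshold 25 only matters for the remaining languages.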

Note that $\Gamma \cup \Gamma_{\mathrm{pin}}$ is affine (respectively, in OR-conj or in NAND-conj)
if, and only if, $\Gamma$ is. Therefore, the case for large-degree instances ($d \geq 25$)
corresponds exactly in complexity to the unbounded case [13].

The case for lower degree bounds is more complex. To put Theorem 23
in context, we give a summary of what is known about the approximability
of $\#w\text{-}\mathrm{HIS}_d$ for various values of $d$ and $w$.

The case $d = 1$ is clearly in FP (Theorem 1) and so is the case $d = w = 2$,
which corresponds to counting independent sets in graphs of maximum
degree two. For $d = 2$ and width $w \geq 3$, Dyer and Greenhill have shown
that there is an FPRAS for $\#w\text{-}\mathrm{HIS}_2$ [17]. For $d = 3$, they have shown that
there is an FPRAS if the width $w$ is at most 3. For larger width, the
approximability of $\#w\text{-}\mathrm{HIS}_3$ is still not known. With the width restricted to
$w = 2$ (normal graphs), Weitz has shown that, for degree $d \in \{3, 4, 5\}$, there
is a deterministic approximation scheme that runs in polynomial time (a
PTAS) [36]. This extends a result of Luby and Vigoda, who gave an FPRAS
for $d \leq 4$ [30]. For $d > 5$, approximating $\#w\text{-}\mathrm{HIS}_d$ becomes considerably
harder. More precisely, Dyer, Frieze and Jerrum have shown that for $d = 6$
the Monte Carlo Markov chain technique is likely to fail, in the sense that
"cautious" Markov chains are provably slowly mixing [11]. They also showed
that, for $d = 25$, there can be no polynomial-time algorithm for approximate
counting, unless NP = RP. These results imply that for $d \in \{6, \dots, 24\}$
and $w \geq 2$ the Monte Carlo Markov chain technique is likely to fail and for
$d \geq 25$ and $w \geq 2$, there can be no FPRAS unless NP = RP. Table 1
summarizes the results.

  Degree $d$       Width $w$    Approximability of $\#w\text{-}\mathrm{HIS}_d$
  ---------------  -----------  ----------------------------------------
  $1$              $\geq 2$     FP
  $2$              $2$          FP
  $2$              $\geq 3$     FPRAS [17]
  $3$              $2, 3$       FPRAS [17]
  $3, 4, 5$        $2$          PTAS [36]
  $6, \dots, 24$   $\geq 2$     The MCMC method is likely to fail [11]
  $\geq 25$        $\geq 2$     No FPRAS unless NP = RP [11]

Table 1: A summary of known approximability of $\#w\text{-}\mathrm{HIS}_d$. For values of
$d$ and $w$ not covered by the table, the approximability is still unknown.

Returning to bounded-degree #CSP, the case $d = 2$ seems to have a
rather different flavour to degree bounds three and higher. This is also the
case for decision CSP: recall that the complexity of degree-$d$ $\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}})$
is the same as unbounded-degree $\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}})$ for all $d \geq 3$ [8], while
degree-2 $\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}})$ is often easier than the unbounded-degree case [8, 20] but
there are still constraint languages $\Gamma$ for which the complexity of degree-2
$\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}})$ is open.

Our key techniques for determining the complexity of $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}})$
for $d \geq 3$ were the 3-simulation of equality and Theorem 17, which says
that every Boolean relation is in OR-conj, in NAND-conj or 3-simulates
equality. However, it seems that not all relations that 3-simulate equality
also 2-simulate equality so the corresponding classification of relations does
not appear to hold. It seems that different techniques will be required for
the degree-2 case. For example, it is possible that there is no FPRAS for
$\#\mathrm{CSP}_3(\Gamma \cup \Gamma_{\mathrm{pin}})$ except when $\Gamma$ is affine. However, Bubley and Dyer have
shown that there is an FPRAS for the restriction of $\#\mathrm{SAT}$ in which each
variable appears at most twice, even though the exact counting problem is
#P-complete [1]. This shows that there is a class $\mathcal{C}$ of constraint languages
for which $\#\mathrm{CSP}_2(\Gamma \cup \Gamma_{\mathrm{pin}})$ has an FPRAS for every $\Gamma \in \mathcal{C}$ but for which no
exact polynomial-time algorithm is known.

We leave the complexity of degree-2 #CSP and of #BIS and the
various parameterized versions of the counting hypergraph independent sets
problem as open questions for future research.